Abstract

The random greedy algorithm for constructing a large partial Steiner triple system
is defined as follows. We begin with a complete graph on $n$ vertices and proceed to
remove the edges of triangles one at a time, where each triangle removed is chosen
uniformly at random from the collection of all remaining triangles. This stochastic
process terminates once it arrives at a triangle-free graph. In this note we show that
with high probability the number of edges in the final graph is at most $O(n^{7/4} \log^{5/4} n)$.



1 Introduction 

We consider the random greedy algorithm for triangle-packing. This stochastic graph process
begins with the graph $G(0)$, set to be the complete graph on vertex set $[n]$, then proceeds to
repeatedly remove the edges of randomly chosen triangles (i.e. copies of $K_3$) from the graph.
Namely, letting $G(i)$ denote the graph that remains after $i$ triangles have been removed, the
$(i+1)$-th triangle removed is chosen uniformly at random from the set of all triangles in $G(i)$.
The process terminates at a triangle-free graph $G(M)$. In this work we study the random
variable $M$, i.e., the number of triangles removed until obtaining a triangle-free graph (or
equivalently, how many edges there are in the final triangle-free graph).
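The process just described is easy to simulate for small $n$. The following is a minimal sketch, not the authors' code: the function name `random_greedy_triangle_packing`, the `seed` parameter, and the list-plus-index bookkeeping are illustrative assumptions chosen so that each step samples uniformly from the triangles still present.

```python
import random
from itertools import combinations


def random_greedy_triangle_packing(n, seed=0):
    """Simulate the process on K_n; return (M, adjacency sets of the final graph).

    Illustrative sketch: the names and the data layout are assumptions,
    not part of the paper.
    """
    rng = random.Random(seed)
    adj = {v: set(range(n)) - {v} for v in range(n)}
    # Every triangle of K_n as a sorted triple, plus its index in the list
    # so that a destroyed triangle can be deleted in O(1) by swap-with-last.
    triangles = list(combinations(range(n), 3))
    pos = {tri: k for k, tri in enumerate(triangles)}

    def drop(tri):
        k = pos.pop(tri)
        last = triangles.pop()
        if last != tri:
            triangles[k] = last
            pos[last] = k

    steps = 0
    while triangles:
        # Uniform choice among all triangles of the current graph G(i).
        a, b, c = triangles[rng.randrange(len(triangles))]
        for u, v in ((a, b), (a, c), (b, c)):
            # Every triangle through the edge uv disappears with that edge.
            for x in list(adj[u] & adj[v]):
                tri = tuple(sorted((u, v, x)))
                if tri in pos:
                    drop(tri)
            adj[u].discard(v)
            adj[v].discard(u)
        steps += 1
    return steps, adj
```

Since each step removes exactly three edges, the final graph has $\binom{n}{2} - 3M$ edges, which is the equivalence between $M$ and the final edge count noted above.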

This process and its variations play an important role in the history of combinatorics.
Note that the collection of triangles removed in the course of the process is a maximal
collection of 3-element subsets of $[n]$ with the property that any two distinct triples in the
collection intersect in fewer than 2 elements. For integers $t < k < n$ a partial $(n,k,t)$-Steiner
system is a collection of $k$-element subsets of an $n$-element set with the property that
any pairwise intersection of sets in the collection has cardinality less than $t$. Note that the
number of sets in a partial $(n,k,t)$-Steiner system is at most $\binom{n}{t} / \binom{k}{t}$. Let $S(n,k,t)$ be the
maximum number of $k$-sets in a partial $(n,k,t)$-Steiner system. In the early 1960's Erdős
and Hanani [5] conjectured that for any integers $t < k$

\[ \lim_{n \to \infty} \frac{S(n,k,t) \binom{k}{t}}{\binom{n}{t}} = 1 . \tag{1} \]

In words, for any $t < k$ there exist partial $(n,k,t)$-Steiner systems that are essentially as large
as allowed by the simple volume upper bound. This conjecture was proved by Rödl [7] in
the early 1980's by way of a randomized construction that is now known as the Rödl nibble.
This construction is a semi-random variation on the random greedy triangle-packing process
defined above, and thereafter such semi-random constructions have been successfully applied
to establish various key results in combinatorics over the last three decades (see e.g. [1] for
further details).

Despite the success of the Rödl nibble, the limiting behavior of the random greedy packing
process remains unknown, even in the special case of triangle packing considered here. Recall
that $G(i)$ is the graph remaining after $i$ triangles have been removed. Let $E(i)$ be the edge set
of $G(i)$. Note that $|E(i)| = \binom{n}{2} - 3i$ and that $|E(M)|$ is the number of edges in the triangle-free
graph produced by the process. Observe that if we show $|E(M)| = o(n^2)$ with non-vanishing
probability then we will establish (1) for $k = 3$, $t = 2$ and obtain that the random greedy
triangle-packing process produces an asymptotically optimal partial Steiner system. This is
in fact the case: It was shown by Spencer [9] and independently by Rödl and Thoma [7]
that $|E(M)| = o(n^2)$ with high probability¹. This was extended to $|E(M)| \le n^{11/6+o(1)}$ by
Grable in [6], where the author further sketched how similar arguments using more delicate
calculations should extend to a bound of $n^{7/4+o(1)}$ w.h.p.

By comparison, it is widely believed that the graph produced by the random greedy
triangle-packing process behaves similarly to the Erdős-Rényi random graph with the same
edge density, hence the process should end once its number of remaining edges becomes
comparable to the number of triangles in the corresponding Erdős-Rényi random graph.

Conjecture (Folklore). With high probability $|E(M)| = n^{3/2+o(1)}$.

Joel Spencer has offered $200 for a resolution of this question. 

In this note we apply the differential-equation method to achieve an upper bound on
$|E(M)|$. In contrast to the aforementioned nibble-approach, whose application in this setting
involves delicate calculations, our approach yields a short proof of the following best-known
result:

Theorem 1. Consider the random greedy algorithm for triangle-packing on $n$ vertices. Let
$M$ be the number of steps it takes the algorithm to terminate and let $E(M)$ be the edges of
the resulting triangle-free graph. Then with high probability, $|E(M)| = O(n^{7/4} \log^{5/4} n)$.



¹Here and in what follows, "with high probability" (w.h.p.) denotes a probability tending to 1 as $n \to \infty$.

Wormald [11] also applied the differential-equation method to this problem, deriving an
upper bound of $n^{2-\epsilon}$ on $|E(M)|$ for any $\epsilon < \epsilon_0 = 1/57$ while stating that "some non-trivial
modification would be required to equal or better Grable's result." Indeed, in a companion
paper we combine the methods introduced here with some other ideas (and a significantly
more involved analysis) to improve the exponent of the upper bound on $|E(M)|$ to about 1.65.
This follow-up work will appear in [3].

2 Evolution of the process in detail 

As is usual for applications of the differential equations method, we begin by specifying the 
random variables that we track. Of course, our main interest is in the variable 

\[ Q(i) = \text{\# of triangles in } G(i) . \]

In order to track $Q(i)$ we also consider the co-degrees in the graph $G(i)$:

\[ Y_{u,v}(i) = |\{x \in [n] : xu, xv \in E(i)\}| \]

for all $\{u,v\} \in \binom{[n]}{2}$. Our interest in $Y_{u,v}$ is motivated by the following observation: If the
$(i+1)$-th triangle taken is $abc$ then

\[ Q(i) - Q(i+1) = Y_{a,b}(i) + Y_{b,c}(i) + Y_{a,c}(i) - 2 . \]

Thus, bounds on $Y_{u,v}$ yield important information about the underlying process.
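The one-step identity can be checked by brute force on a small graph: each edge of the removed triangle $abc$ destroys as many triangles as its co-degree, and $abc$ itself is counted three times, hence the correction of 2. The sketch below is illustrative only; the helper names `count_triangles`, `codegree` and `triangle_removal_drop` are assumptions, not notation from the paper.

```python
from itertools import combinations


def count_triangles(adj):
    """Q: number of triangles in the graph given by adjacency sets."""
    return sum(1 for a, b, c in combinations(sorted(adj), 3)
               if b in adj[a] and c in adj[a] and c in adj[b])


def codegree(adj, u, v):
    """Y_{u,v}: number of common neighbours of u and v."""
    return len(adj[u] & adj[v])


def triangle_removal_drop(adj, a, b, c):
    """Delete the edges of triangle abc (in place).

    Returns (observed drop in Q, drop predicted by the co-degree identity).
    """
    before = count_triangles(adj)
    predicted = (codegree(adj, a, b) + codegree(adj, b, c)
                 + codegree(adj, a, c) - 2)
    for u, v in ((a, b), (b, c), (a, c)):
        adj[u].discard(v)
        adj[v].discard(u)
    return before - count_triangles(adj), predicted
```

For example, in $K_6$ every co-degree equals 4, so removing one triangle destroys $4 + 4 + 4 - 2 = 10$ of the 20 triangles.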

Now that we have identified our variables, we determine the continuous trajectories that 
they should follow. We establish a correspondence with continuous time by introducing a 
continuous variable t and setting 

\[ t = i/n^2 \]

(this is our time scaling). We expect the graph $G(i)$ to resemble a uniformly chosen graph
with $n$ vertices and $\binom{n}{2} - 3i$ edges, which in turn resembles the Erdős-Rényi graph $G_{n,p}$ with

\[ p = 1 - 6i/n^2 = p(t) = 1 - 6t . \]

(Note that we can view $p$ as either a continuous function of $t$ or as a function of the discrete
variable $i$. We pass between these interpretations of $p$ without comment.) Following this
intuition, we expect to have $Y_{u,v}(i) \approx p^2 n$ and $Q(i) \approx p^3 n^3/6$. For ease of notation define

\[ y(t) = p^2(t) , \qquad q(t) = p^3(t)/6 . \]

We state our main result in terms of an error function that slowly grows as the process 
evolves. Define 

\[ f(t) = 5 - 30 \log(1 - 6t) = 5 - 30 \log p(t) . \]
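For concreteness, the trajectories and the error function can be tabulated numerically. This is a small sketch under the definitions above; the function `predicted_values` and its dictionary keys are hypothetical names introduced only for illustration.

```python
import math


def p(t):
    """Edge density after time t = i / n^2."""
    return 1.0 - 6.0 * t


def y(t):
    """Scaled co-degree trajectory: Y_{u,v}(i) is expected near y(t) * n."""
    return p(t) ** 2


def q(t):
    """Scaled triangle-count trajectory: Q(i) is expected near q(t) * n^3."""
    return p(t) ** 3 / 6.0


def f(t):
    """Slowly growing error function f(t) = 5 - 30 log p(t)."""
    return 5.0 - 30.0 * math.log(p(t))


def predicted_values(n, i):
    """Heuristic predictions at step i of the process on n vertices."""
    t = i / n ** 2
    return {
        "edges": n * n * p(t) / 2 - n / 2,        # equals C(n,2) - 3i
        "codegree": y(t) * n,
        "triangles": q(t) * n ** 3,
        "codegree_error": f(t) * math.sqrt(n * math.log(n)),
    }
```

Note that `f(0)` equals 5 and that `f` grows only logarithmically in $1/p$, which is what "slowly grows" refers to.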
Our main result is the following:

Theorem 2. With high probability we have

\[ Q(i) \ge q(t) n^3 - \frac{f^2(t) n^2 \log n}{p(t)} , \tag{2} \]

\[ |Y_{u,v}(i) - y(t) n| \le f(t) \sqrt{n \log n} \quad \text{for all } \{u,v\} \in \binom{[n]}{2} , \tag{3} \]

holding for every $i \le i_0 = \frac{1}{6} n^2 - \frac{5}{3} n^{7/4} \log^{5/4} n$.

Furthermore, for all $i = 1, \ldots, M$ we have

\[ Q(i) \le q(t) n^3 + \frac{1}{3} n^2 p(t) . \tag{4} \]

Note that the error term in the upper bound (4) decreases as the process evolves. This is
not a common feature of applications of the differential equations method for random graph
processes; indeed, the usual approach requires an error bound that grows as the process evolves.
While novel techniques are introduced here to get this 'self-correcting' upper bound, two
versions of 'self-correcting' estimates have appeared to date in applications of the differential
equations method in the literature (see [1] and [10]). The stronger upper bound on the
number of edges in the graph produced by the random greedy triangle-packing process given
in the companion paper [3] is proved by establishing self-correcting estimates for a large
collection of variables (including the variable $Y_{u,v}$ introduced here).

Observe that (2) (with $i = i_0$) establishes Theorem 1. We conclude this section with a
discussion of the implications of (4) for the end of the process, the part of the process where
there are fewer than $n^{3/2}$ edges remaining. Our first observation is that at any step $i$ we can
deduce a lower bound on the number of edges in the final graph; in particular, for any $i$ we
have $|E(M)| \ge |E(i)| - 3Q(i)$. We might hope to establish a lower bound on the number of
edges remaining at the end of the process by showing that there is a step $i$ where $|E(i)| - 3Q(i)$
is large. The bound (4) is (just barely) too weak for this argument to be useful. But we can
deduce the following. Consider $i = n^2/6 - \Theta(n^{3/2})$; that is, consider $p = c n^{-1/2}$. Once $c$ is
small enough the upper bound (4) is dominated by the 'error' term $\frac{1}{3} n^2 p$. If $Q$ remains close
to this upper bound then for the rest of the process we are usually just choosing triangles
in which every edge is in exactly one triangle; in other words, the remaining graph is an
approximate partial Steiner triple system. If $Q$ drops significantly below this bound then
the process will soon terminate.
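To make "small enough" concrete, the two terms of the upper bound on $Q$ can be compared directly at $p = c n^{-1/2}$. The calculation below uses only the definitions $q = p^3/6$ and $p = c n^{-1/2}$; the explicit threshold $c < \sqrt{2}$ is a consequence worked out here for illustration and is not a statement taken from the text.

```latex
\[
  q(t)\,n^3 = \frac{p^3 n^3}{6} = \frac{c^3}{6}\, n^{3/2},
  \qquad
  \frac{1}{3}\, n^2 p = \frac{c}{3}\, n^{3/2},
  \qquad
  \frac{q(t)\,n^3}{\tfrac{1}{3} n^2 p} = \frac{c^2}{2} .
\]
```

So the error term exceeds the main term exactly when $c < \sqrt{2}$, at which point about $n^2 p / 2 = (c/2)\, n^{3/2}$ edges remain.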



3 Proof of Theorem 2

The structure of the proof is as follows. For each variable of interest and each bound 
(meaning both upper and lower) we introduce a critical interval that has one extreme at 
the bound we are trying to maintain and the other extreme slightly closer to the expected
trajectory (relative to the magnitude of the error bound in question). The length of this
interval is generally a function of $t$. If a particular bound is violated then sometime in the
process the variable would have to 'cross' this critical interval. To show that this event
has low probability we introduce a collection of sequences of random variables, a sequence
starting at each step $j$ of the process. This sequence stops as soon as the variable leaves
the critical interval (which in many cases would be immediately), and the sequence forms
either a submartingale or supermartingale (depending on the type of bound in question).
The event that the bound in question is violated is contained in the event that there is an
index $j$ for which the corresponding sub/super-martingale has a large deviation. Each of
these large deviation events has very low probability, even in comparison with the number
of such events. Theorem 2 then follows from the union bound.

For ease of notation we set 

\[ i_0 = \tfrac{1}{6} n^2 - \tfrac{5}{3} n^{7/4} \log^{5/4} n , \qquad p_0 = 10 n^{-1/4} \log^{5/4} n . \]

Let the stopping time $T$ be the minimum of $M$, the first step $i \le i_0$ at which (2) or
(3) fails, and the first step $i$ at which (4) fails. Note that, since $Y_{u,v}$ decreases as the process
evolves, if $i_0 \le i \le T$ then we have

\[ Y_{u,v}(i) = O(n^{1/2} \log^{5/2} n) \quad \text{for all } \{u,v\} \in \binom{[n]}{2} . \]
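As a quick check on these choices, the edge count and density at step $i_0$ follow directly from $|E(i)| = \binom{n}{2} - 3i$ and $p = 1 - 6i/n^2$, taking $i_0 = \frac{1}{6} n^2 - \frac{5}{3} n^{7/4} \log^{5/4} n$. This is a worked calculation from the definitions, included for illustration:

```latex
\[
  |E(i_0)| = \binom{n}{2} - 3 i_0
           = \frac{n^2}{2} - \frac{n}{2} - \frac{n^2}{2} + 5\, n^{7/4} \log^{5/4} n
           = 5\, n^{7/4} \log^{5/4} n - \frac{n}{2},
\]
\[
  p(i_0 / n^2) = 1 - \frac{6 i_0}{n^2} = 10\, n^{-1/4} \log^{5/4} n = p_0 .
\]
```

Since edges are only ever removed, $|E(M)| \le |E(i_0)|$ whenever the process reaches step $i_0$, which gives a bound of the order $n^{7/4} \log^{5/4} n$ as in Theorem 1.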

We begin with the bounds on Q{i). The first observation is that we can write the expected 
one-step change in Q as a function of Q. To do this, we note that we have 

\[ E[\Delta Q] = - \sum_{xyz \in Q} \frac{Y_{xy} + Y_{xz} + Y_{yz} - 2}{Q} = 2 - \frac{1}{Q} \sum_{xy \in E} Y_{xy}^2 \tag{5} \]

and

\[ 3Q = \sum_{xy \in E} Y_{xy} . \]

(And, of course, $|E| = n^2 p/2 - n/2$.) Observe that if $Q$ grows too large relative to its
expected trajectory then the expected change will become more negative, introducing
a drift to $Q$ that brings it back toward the mean. A similar phenomenon occurs if $Q$ gets
too small. Restricting our attention to a critical interval that is some distance from the
expected trajectory allows us to take full advantage of this effect. This is the main idea in
this analysis.
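Both identities are exact and can be confirmed on any graph that contains a triangle: summing the co-degrees over the three edges of every triangle counts each edge $xy$ once per triangle through it, that is $Y_{xy}$ times. The following sketch compares the drift computed by enumerating triangles with the closed form in (5); all function names are illustrative assumptions, and exact rational arithmetic is used to avoid rounding.

```python
import random
from fractions import Fraction
from itertools import combinations


def random_graph(n, prob, seed=0):
    """Adjacency sets of a G(n, prob) sample (test helper, not from the paper)."""
    rng = random.Random(seed)
    adj = {v: set() for v in range(n)}
    for u, v in combinations(range(n), 2):
        if rng.random() < prob:
            adj[u].add(v)
            adj[v].add(u)
    return adj


def triangle_list(adj):
    return [(a, b, c) for a, b, c in combinations(sorted(adj), 3)
            if b in adj[a] and c in adj[a] and c in adj[b]]


def codegree_sums(adj):
    """Return (sum of Y_xy, sum of Y_xy^2) over all edges xy."""
    ys = [len(adj[u] & adj[v]) for u in adj for v in adj[u] if u < v]
    return sum(ys), sum(val * val for val in ys)


def drift_by_enumeration(adj):
    """Average over a uniform triangle abc of 2 - (Y_ab + Y_ac + Y_bc)."""
    tris = triangle_list(adj)
    total = sum(2 - (len(adj[a] & adj[b]) + len(adj[a] & adj[c])
                     + len(adj[b] & adj[c])) for a, b, c in tris)
    return Fraction(total, len(tris))


def drift_by_formula(adj):
    """Right-hand side of (5): 2 - (1/Q) * sum over edges of Y_xy^2."""
    q_count = len(triangle_list(adj))
    _, sum_sq = codegree_sums(adj)
    return 2 - Fraction(sum_sq, q_count)
```

On $K_5$, for instance, every co-degree is 3 and $Q = 10$, so both functions return $2 - 90/10 = -7$.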

For the upper bound on $Q(i)$ our critical interval is

\[ \left( q(t) n^3 + \tfrac{1}{4} n^2 p , \; q(t) n^3 + \tfrac{1}{3} n^2 p \right) . \tag{6} \]

Suppose $Q(i)$ falls in this interval. Since Cauchy-Schwarz gives

\[ \sum_{xy \in E} Y_{xy}^2 \ge \frac{\left( \sum_{xy \in E} Y_{xy} \right)^2}{|E|} \ge \frac{9 Q^2}{n^2 p / 2} , \]

in this situation we have

\[ E[Q(i+1) - Q(i) \mid G(i)] \le 2 - \frac{18 Q}{n^2 p} < 2 - 3 n p^2 - \frac{9}{2} = -3 n p^2 - \frac{5}{2} . \]

Now we consider a fixed index $j$. (We are interested in those indices $j$ where $Q(j)$ has just
entered the critical window from below, but our analysis will formally apply to any $j$.) We
define the sequence of random variables $X(j), X(j+1), \ldots, X(T_j)$ where

\[ X(i) = Q(i) - q n^3 - \frac{n^2 p}{3} \]

and the stopping time $T_j$ is the minimum of $\max\{j, T\}$ and the smallest index $i \ge j$ such
that $Q(i)$ is not in the critical interval (6). (Note that if $Q(j)$ is not in the critical interval
then we have $T_j = j$.) In the event $j \le i < T_j$ we have

\[ \begin{aligned}
E[X(i+1) - X(i) \mid G(i)] &= E[Q(i+1) - Q(i) \mid G(i)] - n^3 \left( q(t + 1/n^2) - q(t) \right) - \frac{n^2}{3} \left( p(t + 1/n^2) - p(t) \right) \\
&\le -3 n p^2 - \frac{5}{2} + 3 n p^2 + 2 + O(1/n) \\
&< 0 .
\end{aligned} \]

So, our sequence of random variables is a supermartingale. Note that if $Q(i)$ crosses the
upper boundary in (4) at $i = T$ then, since the one step change in $Q(i)$ is at most $3n$, there
exists a step $j$ such that

\[ X(j) \le - \frac{n^2 p(t(j))}{12} + O(n) \]

while $T = T_j$ and $X(T) \ge 0$. We apply Hoeffding-Azuma to bound the probability of such
an event: the number of steps is at most $n^2 p(t(j))/6$ and the maximum 1-step difference is
$O(n^{1/2} \log^{5/2} n)$ (as $i < T$ implies bounds on the co-degrees). Thus the probability of such a
large deviation beginning at step $j$ is at most

\[ \exp\left\{ -\Omega\left( \frac{(n^2 p(t(j)))^2}{(n^2 p(t(j))) \cdot (n^{1/2} \log^{5/2} n)^2} \right) \right\} = \exp\left\{ -\Omega\left( \frac{n \, p(t(j))}{\log^5 n} \right) \right\} . \]

As there are at most $n^2$ possible values of $j$, we have the desired bound.

Now we turn to the lower bound on $Q$, namely (2). Here we work with the critical interval

\[ \left( q(t) n^3 - \frac{f(t)^2 n^2 \log n}{p} , \; q(t) n^3 - \frac{(f(t) - 1) f(t) n^2 \log n}{p} \right) . \tag{7} \]

Suppose $Q(i)$ falls in this interval for some $i < T$. Note that our desired inequality is in
the wrong direction for an application of Cauchy-Schwarz to (5). In its place we use the
control imposed on $Y_{u,v}(i)$ by the condition $i < T$. For a fixed $3Q = \sum_{uv \in E} Y_{u,v}$, the sum
$\sum_{uv \in E} Y_{u,v}^2$ is maximized when we make as many terms as large as possible. Suppose this
allows $\alpha$ terms in the sum $\sum_{xy \in E} Y_{xy}$ equal to $n p^2 + f \sqrt{n \log n}$ and $\alpha + \beta$ terms equal to
$n p^2 - f \sqrt{n \log n}$. For ease of notation we view $\alpha, \beta$ as rationals, thereby allowing the terms
in the maximum sum to split completely into these two types. Then we have

\[ \beta f \sqrt{n \log n} = |E| \cdot n p^2 - 3Q = 3 q n^3 - 3Q - \tfrac{1}{2} n^2 p^2 . \]

Therefore, we have

\[ \begin{aligned}
\sum_{xy \in E} Y_{xy}^2 &\le \alpha \left( n p^2 + f \sqrt{n \log n} \right)^2 + (\alpha + \beta) \left( n p^2 - f \sqrt{n \log n} \right)^2 \\
&= \left( \frac{n^2 p}{2} - \frac{n}{2} \right) n^2 p^4 + \left( \frac{n^2 p}{2} - \frac{n}{2} \right) f^2 n \log n - 2 \beta f p^2 n^{3/2} \log^{1/2} n \\
&\le 6 n p^2 Q - \frac{n^4 p^5}{2} + \frac{f^2 p n^3 \log n}{2} + \frac{n^3 p^4}{2} .
\end{aligned} \]

Now, for $j < i_0$ define $T_j$ to be the minimum of $i_0$, $\max\{j, T\}$ and the smallest index $i \ge j$
such that $Q(i)$ is not in the critical interval (7). Set

\[ X(i) = Q(i) - q(t) n^3 + \frac{f(t)^2 n^2 \log n}{p(t)} . \]



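For $j \le i < T_j$ we have the bound

\[ \begin{aligned}
E[X(i+1) - X(i) \mid G(i)] &= E[Q(i+1) - Q(i) \mid G(i)] - n^3 \left( q(t + 1/n^2) - q(t) \right) + n^2 \log n \left( \frac{f^2(t + 1/n^2)}{p(t + 1/n^2)} - \frac{f^2(t)}{p(t)} \right) \\
&\ge 2 - 6 n p^2 + \frac{n^4 p^5}{2Q} - \frac{f^2 p n^3 \log n}{2Q} + O(p) + 3 p^2 n + O(1/n) + \left( \frac{2 f f'}{p} + \frac{6 f^2}{p^2} \right) \log n + O\left( \frac{\log^3 n}{n^2 p^3} \right) \\
&\ge \frac{3 n p^2 (q n^3 - Q)}{Q} - \frac{f^2 p n^3 \log n}{2Q} + \left( \frac{2 f f'}{p} + \frac{6 f^2}{p^2} \right) \log n + O(1) \\
&\ge \left( (21 - o(1)) f^2 - 18 f \right) \frac{\log n}{p^2} + O(1) \\
&> 0 .
\end{aligned} \]

Here the third line uses $\frac{n^4 p^5}{2Q} - 3 n p^2 = \frac{3 n p^2 (q n^3 - Q)}{Q}$. The fourth line uses the bounds
$q n^3 - Q \ge (f - 1) f n^2 \log n / p$ and $Q = (1 - o(1)) q n^3$, both valid inside the critical interval (7)
when $p \ge p_0$, together with $f' > 0$. The final inequality holds since $f \ge 5$.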



If the process violates the bound (2) at step $T = i$ then there exists a $j < i$ such that $T = T_j$,
$X(T) = X(i) < 0$ and

\[ X(j) \ge \frac{f(t(j)) n^2 \log n}{p(t(j))} - O(n) . \]

The submartingale $X(j), X(j+1), \ldots, X(T_j)$ has length at most $n^2 p(t(j))/6$ and maximum
one-step change $O(n^{1/2} \log^{3/2} n)$. The probability that we violate the lower bound (2) is at
most

\[ n^2 \cdot \exp\left\{ -\Omega\left( \frac{\left( f(t(j)) n^2 \log n / p(t(j)) \right)^2}{n^2 p(t(j)) \cdot n \log^3 n} \right) \right\} = \exp\left\{ -\Omega\left( \frac{n}{\log n} \right) \right\} . \]

Finally, we turn to the co-degree estimate for $Y_{u,v}$. Let $u, v$ be fixed. We begin with the
upper bound. Our critical interval here is

\[ \left( y(t) n + (f(t) - 5) \sqrt{n \log n} , \; y(t) n + f(t) \sqrt{n \log n} \right) . \tag{8} \]

For a fixed $j \le i_0$ we consider the sequence of random variables $Z_{u,v}(j), Z_{u,v}(j+1), \ldots, Z_{u,v}(T_j)$
where

\[ Z_{u,v}(i) = Y_{u,v}(i) - y(t) n - f(t) \sqrt{n \log n} \]

and $T_j$ is defined to be the minimum of $i_0$, $\max\{j, T\}$ and the smallest index $i \ge j$ such that
$Y_{u,v}(i)$ is not in the critical interval (8). To see that this sequence forms a supermartingale,
we note that $i < T$ gives

\[ |Q(i) - q(t) n^3| \le \frac{f(t)^2 n^2 \log n}{p(t)} \]

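and therefore

\[ \begin{aligned}
E[Z_{u,v}(i+1) - Z_{u,v}(i) \mid G(i)] &\le - \sum_{x \in N(u) \cap N(v)} \frac{Y_{u,x} + Y_{v,x} - \mathbf{1}_{uv \in E(i)}}{Q} - n \left( y(t + 1/n^2) - y(t) \right) - \sqrt{n \log n} \left( f(t + 1/n^2) - f(t) \right) \\
&\le - \frac{2 (y n + (f - 5) \sqrt{n \log n}) (y n - f \sqrt{n \log n})}{Q} + \frac{12 p}{n} - \frac{f' \log^{1/2} n}{n^{3/2}} + O\left( \frac{1}{n^2 p} \right) \\
&\le - \frac{2 (y n + (f - 5) \sqrt{n \log n}) (y n - f \sqrt{n \log n})}{q n^3} \left( 1 - \frac{f^2 n^2 \log n}{p \, q n^3} \right) + \frac{12 p}{n} - \frac{f' \log^{1/2} n}{n^{3/2}} + O\left( \frac{1}{n^2 p} \right) \\
&\le \frac{10 y n^{3/2} \log^{1/2} n}{q n^3} + \frac{84 f^2 \log n}{p^3 n^2} - \frac{f' \log^{1/2} n}{n^{3/2}} + O\left( \frac{1}{n^2 p} \right) \\
&= \frac{\log^{1/2} n}{n^{3/2}} \left( \frac{60}{p} + \frac{84 f^2 \log^{1/2} n}{p^3 n^{1/2}} + O\left( \frac{1}{n^{1/2} p} \right) - f' \right) .
\end{aligned} \]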

To get the supermartingale condition we consider each positive term here separately. The
following bounds would suffice:

\[ \frac{60}{p} \le \frac{f'}{3} , \qquad \frac{84 f^2 \sqrt{\log n}}{p^3 n^{1/2}} \le \frac{f'}{3} , \qquad O\left( \frac{1}{n^{1/2} p} \right) \le \frac{f'}{3} . \]

The first term requires

\[ f'(t) \ge \frac{180}{p(t)} = \frac{180}{1 - 6t} . \]

We see that this requirement, together with the initial condition $f(0) \ge 5$, imposes

\[ f(t) \ge 5 - 30 \log(1 - 6t) = 5 - 30 \log p(t) . \]

But this value for $f$ also suffices to handle the remaining terms as we restrict our attention
to $p \ge p_0 = 10 n^{-1/4} \log^{5/4} n$. Thus, we have established that $Z_{u,v}(i)$ is a supermartingale.
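The step from the differential inequality to the stated form of $f$ is a one-line integration, spelled out here for convenience. It uses only $f'(s) \ge 180/(1 - 6s)$ and $f(0) \ge 5$:

```latex
\[
  f(t) \;\ge\; f(0) + \int_0^t \frac{180}{1 - 6s}\, ds
       \;=\; 5 + \Big[ -30 \log(1 - 6s) \Big]_0^t
       \;=\; 5 - 30 \log(1 - 6t) .
\]
```

In particular, for $p \ge p_0$ this gives $f = 5 + 30 \log(1/p) \le 5 + \frac{30}{4} \log n$ once $n$ is large, so $f$ is $O(\log n)$ throughout the range considered.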

To bound the probability of a large deviation we recall a lemma from [2]. A sequence of
random variables $X_0, X_1, \ldots$ is $(\eta, N)$-bounded if for all $i$ we have

\[ -\eta \le X_{i+1} - X_i \le N . \]

Lemma 3. Suppose $0 = X_0, X_1, \ldots$ is an $(\eta, N)$-bounded submartingale for some $\eta < N/10$.
Then for any $a < \eta m$ we have $P(X_m < -a) < \exp\left( -a^2 / (3 \eta N m) \right)$.

As $-Z_{u,v}(j), -Z_{u,v}(j+1), \ldots$ is a $(6/n, 2)$-bounded submartingale, the probability that we
have $T = T_j$ with $Y_{u,v}(T) > y n + f \sqrt{n \log n}$ is at most

\[ \exp\left\{ - \frac{25 n \log n}{3 \cdot (6/n) \cdot 2 \cdot (p(t(j)) n^2 / 6)} \right\} \le \exp\left\{ - \frac{25 \log n}{6} \right\} . \]

Note that there are at most $n^4$ choices for $j$ and the pair $u, v$. As the argument for the lower
bound in (3) is the symmetric analogue of the reasoning we have just completed, Theorem 2
follows. ■
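The arithmetic in this last application of Lemma 3 can be reproduced numerically. The sketch below plugs in $a = 5\sqrt{n \log n}$ (the width of the critical interval), $\eta = 6/n$, $N = 2$ and $m = p n^2/6$ as in the display above; the function names are hypothetical, and the code only evaluates the stated bound rather than proving anything.

```python
import math


def lemma3_tail_bound(a, eta, big_n, m):
    """Evaluate exp(-a^2 / (3 * eta * N * m)), the bound stated in Lemma 3.

    Raises ValueError when the hypotheses eta < N/10 and a < eta*m fail.
    """
    if not (eta < big_n / 10 and a < eta * m):
        raise ValueError("hypotheses of Lemma 3 are not satisfied")
    return math.exp(-a * a / (3.0 * eta * big_n * m))


def codegree_failure_bound(n, p):
    """Bound for one pair {u, v} and one starting step j at density p."""
    a = 5.0 * math.sqrt(n * math.log(n))   # width of the critical interval
    m = p * n * n / 6.0                    # at most this many steps remain
    return lemma3_tail_bound(a, 6.0 / n, 2.0, m)
```

The bound simplifies to $n^{-25/(6p)}$, which is at most $n^{-25/6}$; since $25/6 > 4$, it survives a union bound over the at most $n^4$ choices of $j$ and $\{u, v\}$.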